cochrans_q: Cochran's Q test for comparing multiple classifiers
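A method for comparing multiple algorithms at the same time
Either all of them perform the same (null hypothesis) or there is a difference (if there is a difference, follow-up tests are performed)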
In a sense, Cochran's Q test is analogous to ANOVA for binary outcomes.
Cochran's Q test tests the null hypothesis H0 that there is no difference between the classification accuracies p_i of the L classifiers: [$ H_0: p_1 = p_2 = \cdots = p_L]
TODO: check the formula
With L classifiers → under H0 the statistic Q approximately follows a χ² distribution with L-1 degrees of freedom
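A minimal sketch of how Q could be computed by hand, assuming the usual form of the statistic: Q = (L-1) * (L * sum(G_i^2) - T^2) / (L * T - sum(L_j^2)), where G_i is the number of samples classifier i gets right, L_j is the number of classifiers that get sample j right, and T = sum(G_i). The function name `cochrans_q_by_hand` is made up for this note and is not part of mlxtend. The lists are a compact rebuild of the arrays used in example1.py.

```python
import math


def cochrans_q_by_hand(y_true, *y_models):
    """Return (Q, degrees of freedom) for L >= 2 classifiers scored on the same samples."""
    n_models = len(y_models)
    # correct[j][i] == 1 if classifier i predicted sample j correctly, else 0
    correct = [[int(pred[j] == y_true[j]) for pred in y_models]
               for j in range(len(y_true))]
    g = [sum(row[i] for row in correct) for i in range(n_models)]  # G_i: hits per classifier
    l_j = [sum(row) for row in correct]                            # L_j: hits per sample
    t = sum(g)                                                     # T: total number of hits
    numerator = (n_models - 1) * (n_models * sum(x * x for x in g) - t * t)
    # The denominator is 0 when every sample is classified correctly by all or by none
    # of the classifiers; that case is not handled in this sketch.
    denominator = n_models * t - sum(x * x for x in l_j)
    return numerator / denominator, n_models - 1


# Same labels and predictions as in example1.py, written compactly
y_true = [0] * 100
y_model_1 = [1] * 16 + [0] * 84
y_model_2 = [1] * 6 + [0] * 14 + [1] * 2 + [0] * 78
y_model_3 = [1] * 3 + [0] * 3 + [1] + [0] * 13 + [1] * 2 + [0] * 76 + [1] * 2

q, df = cochrans_q_by_hand(y_true, y_model_1, y_model_2, y_model_3)
# For df = 2 the chi-squared survival function has the closed form exp(-Q / 2);
# for other df something like scipy.stats.chi2.sf(q, df) would be needed.
p_value = math.exp(-q / 2)
print(q, df, p_value)
```

Here G = (84, 92, 92) and T = 268, so Q = 256 / 34 ≈ 7.53 with 2 degrees of freedom and p ≈ 0.023.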
code:example1.py
>> import numpy as np
>> from mlxtend.evaluate import cochrans_q
>> y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0])
>> y_model_1 = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0])
>> y_model_2 = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0])
>> y_model_3 = np.array([1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
... 1, 1])
>> # significance level alpha = 0.05
>> q, p_value = cochrans_q(y_true, y_model_1, y_model_2, y_model_3)
>> q
7.529411764705882
>> p_value # p_value < 0.05, so the null hypothesis "all classifiers have the same accuracy" is rejected
0.023174427241061245
>> # next step: multiple post hoc pair-wise tests
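One possible way to run the post hoc pair-wise tests is McNemar's test on every pair of classifiers with a Bonferroni-adjusted significance level. The sketch below assumes that choice; the helper `mcnemar_uncorrected` and the `models` dict are names made up for this note, and the chi-squared approximation without continuity correction is used only to keep it short.

```python
import math
from itertools import combinations


def mcnemar_uncorrected(y_true, pred_a, pred_b):
    """Return (chi2, p) of McNemar's test without continuity correction."""
    # b: A right and B wrong; c: A wrong and B right
    b = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b) if pa == t and pb != t)
    c = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b) if pa != t and pb == t)
    if b + c == 0:
        # The two classifiers never disagree; returning (0, 1) is a convention
        # chosen for this sketch.
        return 0.0, 1.0
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of the chi-squared distribution with 1 degree of freedom
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Same labels and predictions as in example1.py, written compactly
y_true = [0] * 100
models = {
    "model_1": [1] * 16 + [0] * 84,
    "model_2": [1] * 6 + [0] * 14 + [1] * 2 + [0] * 78,
    "model_3": [1] * 3 + [0] * 3 + [1] + [0] * 13 + [1] * 2 + [0] * 76 + [1] * 2,
}

alpha = 0.05
pairs = list(combinations(models, 2))
alpha_adjusted = alpha / len(pairs)  # Bonferroni: split alpha over the 3 pairs

results = {}
for name_a, name_b in pairs:
    chi2, p = mcnemar_uncorrected(y_true, models[name_a], models[name_b])
    results[(name_a, name_b)] = (chi2, p, p < alpha_adjusted)
    print(name_a, name_b, chi2, p, p < alpha_adjusted)
```

With three classifiers there are three pairs, so each pair-wise p-value is compared against 0.05 / 3. When two classifiers disagree on only a few samples, an exact binomial version of McNemar's test is often preferred over the chi-squared approximation.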
Let's illustrate that Cochran's Q test is indeed just a generalized version of McNemar's test:
code:cochran_q_and_mcnemar.py
>> from mlxtend.evaluate import mcnemar, mcnemar_table
>> chi2, p_value = cochrans_q(y_true, y_model_1, y_model_2)
>> chi2
5.333333333333333
>> p_value
0.020921335337794035
>> mcnemar(mcnemar_table(y_true, y_model_1, y_model_2), corrected=False) # identical!
(5.333333333333333, 0.020921335337794035)
>> cochrans_q(y_true, y_model_1, y_model_2) == mcnemar(mcnemar_table(y_true, y_model_1, y_model_2), corrected=False)
True
>> mcnemar(mcnemar_table(y_true, y_model_1, y_model_2), corrected=True)
(4.083333333333333, 0.04330814281079206)
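Why the two agree for L = 2 can be sketched as follows, assuming the usual form of the statistic, Q = (L-1) * (L * sum(G_i^2) - T^2) / (L * T - sum(L_j^2)). The symbols a, b, c, d are notation introduced for this note: a = both models correct, b = only model 1 correct, c = only model 2 correct, d = both wrong.

```latex
\begin{align*}
G_1 &= a + b, \quad G_2 = a + c, \quad T = G_1 + G_2 = 2a + b + c \\
L \sum_i G_i^2 - T^2 &= 2\,(G_1^2 + G_2^2) - (G_1 + G_2)^2 = (G_1 - G_2)^2 = (b - c)^2 \\
L\,T - \sum_j L_j^2 &= 2\,(2a + b + c) - (4a + b + c) = b + c \\
Q &= (L - 1)\,\frac{(b - c)^2}{b + c} = \frac{(b - c)^2}{b + c}
\end{align*}
```

The last line is the McNemar statistic without continuity correction. For y_model_1 and y_model_2 the counts are b = 2 and c = 10, which gives 64 / 12 ≈ 5.33; the corrected=True variant uses (|b - c| - 1)^2 / (b + c) = 49 / 12 ≈ 4.08.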